Abstract 


A concurrent system consists of processes communicating via shared objects, such 
as shared variables, queues, etc. The concept of wait-freedom was introduced to cope 
with process failures: each process that accesses a wait-free object is guaranteed to get 
a response even if all the other processes crash. But what if these wait-free objects 
themselves fail? For example, if a wait-free object “crashes”, all the processes that 
access that object are prevented from making progress. In this paper, we introduce the 
concept of fault-tolerant wait-free objects, and study the problem of implementing them. 
We give a universal method to construct fault-tolerant wait-free objects, for all types 
of “responsive” failures (including one in which faulty objects may “lie”). In sharp 
contrast, we prove that many common and interesting object types (such as queues, sets, 
and test&set) have no fault-tolerant wait-free implementations even under the most 
benign of the “non-responsive” types of failure. We also introduce several concepts and 
techniques that are central to the design of fault-tolerant concurrent systems: the 
concepts of self-implementation and graceful degradation, and techniques to automatically 
increase the fault-tolerance of implementations. We prove matching lower bounds on 
the resource complexity of most of our algorithms. 


* Research supported by NSF grants CCR-8901780 and CCR-9102231, DARPA/NASA Ames grant NAG- 
2-593, grants from the IBM Endicott Programming Laboratory and Siemens Corp. 

* Also supported by an IBM graduate fellowship. 
1 Introduction 


1.1 Background and motivation 

A concurrent system consists of processes communicating via shared objects. Examples of 
shared object types include data structures such as read/write register, queue, set, and 
tree, and synchronization primitives such as test&set, fetch&add, and compare&swap. 
Even though different processes may concurrently access a shared object, the object must 
behave as if all these accesses occur in some sequential order. More precisely, the behavior 
of a shared object must be linearizable [HW90]. One way to ensure linearizability is to 
implement shared objects using critical sections [CHP71]. This approach, however, is not 
fault-tolerant: the crash of a process while in the critical section of a shared object can 
permanently prevent the rest of the processes from accessing that object. This lack of 
fault-tolerance led to the concept of wait-free implementations of shared objects. Informally, a 
shared object is wait-free if every operation invocation on that object is returned a response 
even if some or all other processes in the system crash. 

Thus, a concurrent system in which all shared objects are wait-free is resilient to process 
crashes. However, such a system is not resilient to shared object failures. 1 For example, 
the “crash” of a single shared object stops all the processes that need to access that object. 
Motivated by this observation, we study the problem of implementing wait-free shared 
objects that are also fault-tolerant. With such objects, the system is guaranteed to make 
progress despite process crashes and the failures of some underlying objects. To the best 
of our knowledge, the issue of fault-tolerant wait-free shared objects has not been addressed 
before. (To simplify notation, hereafter “object” denotes a “shared object”.) 


1.2 Object failures 

We classify object failures into two broad categories: responsive and non-responsive. We 
require that objects subject to responsive failures continue to respond (in finite time) to 
operation invocations. The responses may be incorrect. In contrast, objects subject to 
non-responsive failures are exempt from responding to operation invocations. Such objects 
may “hang” on the invoking process. 

We divide responsive failures into three sub-classes: R-crash, R-omission, and R-arbitrary. 
An object subject to R-crash failure behaves correctly until it fails, and once it fails, it 
returns a distinguished response ⊥ to every invocation. As with R-crash, an object subject to 
R-omission failures may return the correct response or ⊥. However, even if it responds ⊥ 
to a process p, a subsequent operation invocation by a different process q may get a correct 
response. This behavior models an object O made of several components, some of which 
failed. The operation by p “ran into” a failed component of O, while the one by q only 
encountered correct components of O. Finally, objects subject to R-arbitrary failures may 
“lie”, i.e., return arbitrary responses to operation invocations. 

1 Even “software” objects have underlying hardware components. The software and/or the hardware 
could be faulty. 

Similarly, we divide non-responsive failures into crash, omission, and arbitrary. An 
object subject to crash failure behaves correctly until it fails, and once it fails, it never 
responds to operation invocations. An object subject to omission failures may fail to 
respond to the invocations of an arbitrary subset of processes, but continue to respond to the 
invocations of the remaining processes (forever). The behavior of an object subject to an 
arbitrary failure is completely unrestricted: it may not respond to an invocation, and even 
if it does, the response may be arbitrary. 

1.3 Fault-tolerant objects 

Let T be an object type and let L = (T_1, T_2, ..., T_n) be a list of object types. A function 
I : T_1 × T_2 × ... × T_n → T is an implementation of T from L if O = I(o_1, o_2, ..., o_n) is a 
wait-free object of type T whenever o_i (1 ≤ i ≤ n) is a wait-free object of type T_i. We call 
O a derived object (of I) and the o_i's the base objects of O. I is t-tolerant for a failure model 
M if O behaves correctly even if a maximum of t base objects of O fail according to M. 

The implementation I is a self-implementation if T_1 = T_2 = ... = T_n = T. In 
other words, in a self-implementation the base objects are required to be of the same 
type as the derived object. For example, consider the object type 2-process queue (i.e., a 
queue that can be accessed by at most two processes). In Section 6.3, we show that (for 
every t) there is a t-tolerant self-implementation of 2-process queue for R-arbitrary failures. 
Intuitively, this means that using a set of wait-free 2-process queues, t of which are subject 
to R-arbitrary failures, we can implement a failure-free wait-free 2-process queue. Thus in 
a self-implementation fault-tolerance is achieved through replication. 

1.4 Results 

To study whether a general object type has a t-tolerant implementation, we focus on two 
particular object types: consensus 2 and register. Herlihy [Her91] and Plotkin [Plo89] 
showed that one can implement a wait-free object of any type using only consensus and 
register objects. Thus, if consensus and register have t-tolerant implementations, then 
every object type has a t-tolerant implementation. 

We first study the problem of tolerating responsive failures. We give t-tolerant 
self-implementations of consensus for R-crash, R-omission, and R-arbitrary failures. For 
R-crash and R-omission failures, our self-implementation is optimal, requiring only t + 1 base 
consensus objects if t of them may fail. For R-arbitrary failures, our self-implementation is 
efficient, requiring O(t log t) base consensus objects. We also give t-tolerant self-implementations 
of register for R-crash, R-omission, and R-arbitrary failures. Combining the above results 
with [Her91, Plo89], we conclude that every object type T has a t-tolerant implementation 
(from consensus and register) for all responsive models of failures. Moreover, if T 
implements consensus and register, then T has a t-tolerant self-implementation. This 

2 A consensus object supports two operations, propose 0 and propose 1, and satisfies the following two 
properties. An operation gets a response v only if there is some prior invocation of propose v. Further, the 
response is the same for all invocations of both operations. 

implies that familiar object types such as (2-process) queue, stack, test&set, fetch&add, 
and (N-process) compare&swap have t-tolerant self-implementations even for R-arbitrary 
failures! 

What about tolerating non-responsive failures? Unfortunately, the results are mostly 
negative. We show that there is no 1-tolerant implementation of consensus even for crash 
failures, the most benign of the non-responsive models of failures. 3 This immediately implies 
that any object type T that implements consensus (such as queue, stack, test&set, 
swap, compare&swap, etc.) has no 1-tolerant implementations for crash failures. In 
contrast, we show that register has a t-tolerant self-implementation even for arbitrary 
failures. In addition to these universality and impossibility results, this paper contains the 
following results. 

Let I be a t-tolerant implementation for failure model M. By definition, every derived 
object of I is guaranteed to behave correctly even if up to t base objects fail according to 
M. But what happens if more than t base objects fail? In general, the derived object may 
experience a more severe failure than M! We say a t-tolerant implementation for a failure 
model M is gracefully degrading if the failure of more than t base objects (according to 
M) cannot cause the derived object to experience a more severe failure than M. From a 
1-tolerant gracefully degrading self-implementation of any object type T for a failure model 
M, we show how to recursively construct a t-tolerant self-implementation of T for M. This 
provides a method for automatically increasing the fault-tolerance of an object. 

In general, graceful degradation increases the cost of an implementation. For instance, 
consider t-tolerant implementations of consensus for R-omission failures. As already 
mentioned, there is such an implementation using only t + 1 base objects. However, this 
implementation is not gracefully degrading. In fact, we show that, in this case, graceful 
degradation requires at least 2t + 1 base objects, and we give a matching algorithm. 

We prove that there is a large class of object types that have no gracefully degrading 
implementations for R-crash. Intuitively, this means that whatever the implementation, 
the failure of the implemented object will be more severe than R-crash, even if all its base 
objects can only fail by R-crash. 

We study the problem of translating severe failures into more benign failures [NT90]. 
In particular we show that given 3t + 1 (base) consensus objects, at most t of which are 
subject to R-arbitrary failures, we can implement a (derived) consensus object that can 
only fail by R-omission. We also show that this translation from R-arbitrary to R-omission 
is resource optimal. 

We also show that arbitrary failures can be viewed as having two orthogonal 
components: omission and R-arbitrary. Specifically, for any object type T, given any t-tolerant 
self-implementations I' and I'' of T for omission failures and R-arbitrary failures 
respectively, we show how to construct a t-tolerant self-implementation of T for arbitrary failures. 
This decomposition simplifies the problem of tolerating arbitrary failures. 

3 The impossibility of implementing a fault-tolerant consensus object from any finite list of base objects, 
one of which may crash, is shown using the impossibility of solving the consensus problem among a finite 
number of processes, one of which may crash [LAA87]. 

2 Preliminaries 


A concurrent system consists of processes communicating via (shared) objects. A process 
interacts with an object by invoking an operation, and receiving a corresponding response 
from the object. Processes may exhibit arbitrary variations in their execution speeds. 
Further, processes may crash. That is, a process may stop at any point in its execution and 
never take any steps thereafter. 

An object is specified by a type. An object type T is defined by N(T), OP(T) and 
A(T), where N(T) is the maximum number of processes that may access an object O of 
type T, OP(T) is the set of operations supported by O, and A(T) specifies how O behaves 
when these operations are applied sequentially. For concreteness, we assume A(T) is a 
finite/infinite state non-deterministic automaton where some states are designated as initial 
states. There is a transition from state s to state t labeled (op, v) iff invoking the operation 
op when the object is in state s may leave the object in state t, returning the response v. 
We say a sequential execution S = (op_1, v_1), (op_2, v_2), ..., (op_k, v_k) from state s is consistent 
with T iff, viewing A(T) as a directed graph with states as nodes and transitions as directed 
edges, there is a directed path labeled S from state s. Further, S is consistent with T if 
there is some initial state s of T such that S from s is consistent with T. 
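
The path-based definition of consistency can be made concrete with a small sketch. The encoding below is an assumption made for illustration: the transition relation is a Python dictionary, and the names `consistent_from`, `consistent`, and `CONSENSUS` are hypothetical and do not come from the paper. Because the automaton may be non-deterministic, the check tracks the set of states reachable along the labeled sequence.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple

State = Hashable
Step = Tuple[str, Hashable]  # one (operation, response) pair
# (state, op) -> set of (next_state, response) pairs, i.e. labeled edges.
Transitions = Dict[Tuple[State, str], Set[Tuple[State, Hashable]]]


def consistent_from(trans: Transitions, state: State, seq: Sequence[Step]) -> bool:
    """True iff some directed path labeled `seq` starts at `state`."""
    current = {state}
    for op, resp in seq:
        # Follow every edge labeled (op, resp) out of every reachable state.
        current = {
            nxt
            for s in current
            for (nxt, r) in trans.get((s, op), ())
            if r == resp
        }
        if not current:
            return False
    return True


def consistent(trans: Transitions, initial: Iterable[State], seq: Sequence[Step]) -> bool:
    """True iff `seq` is consistent from at least one initial state."""
    return any(consistent_from(trans, s, seq) for s in initial)


# Illustrative automaton for a binary consensus object:
# state None means "undecided"; state 0 or 1 is the decided value.
CONSENSUS: Transitions = {
    (None, "propose 0"): {(0, 0)},
    (None, "propose 1"): {(1, 1)},
    (0, "propose 0"): {(0, 0)},
    (0, "propose 1"): {(0, 0)},
    (1, "propose 0"): {(1, 1)},
    (1, "propose 1"): {(1, 1)},
}
```

For instance, `consistent(CONSENSUS, [None], [("propose 1", 1), ("propose 0", 1)])` holds, while the same sequence with the second response changed to 0 labels no path.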

Each process may have at most one pending invocation on any given object. That is, a 
process p cannot invoke an operation on an object O unless the previous operation of p on 
object O has already received a response. However, operations from different processes may 
overlap on an object. The sequential specification is therefore not sufficient to understand 
the behavior of an object. We use linearizability defined by Herlihy and Wing [HW90] 
as the criterion for the correctness of an object. Informally, linearizability requires every 
operation execution to appear to take effect instantaneously at some point in time between 
its invocation and response. We make this more formal below. 

Let O be an object shared by the processes p_i, i = 1, ..., N. Let E_t be an execution of 
the concurrent system (p_1, p_2, ..., p_N; O) up to time t. Define H(E_t), the history of the 
execution E_t, as follows: (p_i, op, v, t_s, t_e) ∈ H(E_t) iff process p_i invokes operation op in E_t 
at time t_s, and that operation completes at time t_e returning the response v. Further, 
(p_i, op, *, t_s, ∞) ∈ H(E_t) iff process p_i invokes operation op in E_t at time t_s and that 
operation does not complete by time t. We say H(E_t) is linearizable with respect to type 
T if and only if there exist a sequence S of (operation, response) pairs and a one-to-one 
correspondence f from H(E_t) to S satisfying the following: 

• S is consistent with respect to T. 

• |S| = |H(E_t)|, i.e., there are exactly as many elements in the sequence S as there are 
in the set H(E_t). 

• If η = (p_i, op, v, t_s, t_e) ∈ H(E_t) and f(η) = S_j, then S_j = (op, v). (Here S_j denotes 
the j-th element of the sequence S.) 

• If η = (p_i, op, *, t_s, ∞) ∈ H(E_t) and f(η) = S_j, then S_j = (op, v) for some response v. 

• If η' = (p_i, op', v', t'_s, t'_e), η'' = (p_j, op'', v'', t''_s, t''_e) ∈ H(E_t), and t'_e < t''_s, then f(η') = S_k 
and f(η'') = S_l for some k < l. 
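
The conditions above can be checked mechanically on small histories. The following brute-force sketch is illustrative only: it assumes history entries are tuples (process, op, response, t_s, t_e), uses the hypothetical marker `PENDING` and `math.inf` for incomplete operations, and describes the sequential specification by a hypothetical function `step(state, op)` that yields the possible (next_state, response) pairs. It tries every ordering, so it is only practical for a handful of operations.

```python
import math
from itertools import permutations

PENDING = "*"  # response placeholder for an operation that has not completed


def linearizable(history, initial_states, step):
    """Search for a sequence S and a bijection satisfying the five conditions."""
    ops = list(history)
    n = len(ops)
    for order in permutations(range(n)):
        position = {idx: k for k, idx in enumerate(order)}
        # Real-time order: if a completes before b is invoked, a precedes b in S.
        if any(
            ops[a][4] < ops[b][3] and position[a] > position[b]
            for a in range(n)
            for b in range(n)
        ):
            continue
        # Walk the sequential specification along this ordering.
        states = set(initial_states)
        for idx in order:
            _, op, resp, _, _ = ops[idx]
            states = {
                nxt
                for s in states
                for (nxt, r) in step(s, op)
                # A pending operation may be assigned any response.
                if resp == PENDING or r == resp
            }
            if not states:
                break
        if states:
            return True
    return False


def consensus_step(state, op):
    """Sequential behavior of binary consensus; state None means undecided."""
    proposed = int(op.split()[-1])
    decided = proposed if state is None else state
    return [(decided, decided)]
```

As a usage example, two overlapping proposals `("p", "propose 0", 1, 0, 3)` and `("q", "propose 1", 1, 1, 2)` form a linearizable history (q's operation is ordered first), whereas the same responses are not linearizable if p's operation completes before q's begins.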


An object O is of type T if for every t, and every execution E_t of the concurrent system 
(p_1, p_2, ..., p_N; O) up to time t, H(E_t) is linearizable with respect to T. We say that T 
is an N-process type if N = N(T). Any object of an N-process type is an N-process object. 

Objects are either primitive or derived. A primitive object is completely “external” to 
the invoking process. In other words, after a process invokes an operation on a primitive 
object, it may simply wait for the object to return the response. In contrast, a derived object 
O is “implemented” in software from base objects (each one of which is either derived 
or primitive). Such an implementation provides a procedure Apply(p_i, op, O) (for each 
op ∈ OP(T) and 1 ≤ i ≤ N(T)) that process p_i must execute in order to invoke an operation 
op on O and receive the corresponding response from O. Each step in Apply(p_i, op, O) is 
either an invocation on a base object of O, or checking if a base object has returned a 
response to a previous invocation 4, or some local computation. 

We now define wait-freedom for primitive and derived objects. A primitive object is 
wait-free if every operation invocation by every process gets a response in finite time. A 
derived object O is wait-free if Apply(p_i, op, O) (for each op ∈ OP(T) and 1 ≤ i ≤ N(T)) 
returns a response in a finite number of steps, regardless of the execution speeds of the 
remaining processes. Unless mentioned otherwise, all the objects considered in this paper 
are wait-free. 


3 Models of failure 

An object is only an abstraction with a multitude of possible implementations. For instance, 
it may be implemented as a hardware module in a tightly coupled multi-processor system, 
or as a server machine in a message passing distributed system. Whatever the 
implementation, the reality is that hardware components sometimes fail, and when this happens, the 
implementation fails to provide the intended abstraction. 

Object failures may lead to unsatisfactory system behavior. For instance, the “crash” 
of an object prevents the progress of all processes that access the object. Similarly, if the 
object returns “incorrect” responses, the system behavior also becomes incorrect. It is 
therefore important to implement derived objects that behave correctly even if some of the 
base objects of the implementation fail. The cost and the complexity of such a fault-tolerant 
implementation depends on the failure model , i.e., the manner in which a failed base object 
departs from its expected behavior. In this paper, we define a spectrum of failure models 
that fall into two broad classes: Responsive and non-responsive. 

4 Note that p_i does not “block” for the response from the object; it only “polls” for the response, then 
proceeds to the next step. 

3.1 Responsive models of failure 


An object subject to responsive failures responds to every operation invocation. The 
response is possibly incorrect, but the object never fails to respond. We describe below three 
increasingly severe models of responsive failures. 


3.1.1 R-crash 

R-crash is the most benign model of object failure. This model is based on the premise 
that an object detects when it becomes faulty. Informally, an object subject to R-crash 
behaves correctly until it fails, and once it fails, it returns a distinguished response ⊥ to 
every operation invocation. More precisely, an object O of type T subject to R-crash failure 
satisfies the following three properties. Let E_t be any execution of the concurrent system 
(p_1, p_2, ..., p_N; O) up to time t, and H(E_t) be the corresponding history, as defined before. 

1. O is wait-free. 

2. If (p, op, ⊥, t_s, t_e), (p', op', v', t'_s, t'_e) ∈ H(E_t) and t_e < t'_s, then v' = ⊥. 

3. Let H'(E_t) = H(E_t) − {(p, op, ⊥, t_s, t_e) ∈ H(E_t)}. Then H'(E_t) is linearizable with 
respect to T. 
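
Properties 2 and 3 can be read as simple operations on a recorded history. The sketch below is illustrative and rests on assumptions that are not part of the paper: history entries are tuples (process, op, response, t_s, t_e), the distinguished failure response is represented by the hypothetical constant `BOTTOM`, and pending operations carry the hypothetical marker `PENDING`.

```python
BOTTOM = "⊥"   # stand-in for the distinguished failure response
PENDING = "*"  # stand-in for the response slot of an incomplete operation


def once_bottom_always_bottom(history):
    """Property 2: every operation invoked after some failure response
    has completed must itself return the failure response."""
    bottom_ends = [te for (_, _, v, _, te) in history if v == BOTTOM]
    if not bottom_ends:
        return True
    first_bottom_end = min(bottom_ends)
    return all(
        v == BOTTOM
        for (_, _, v, ts, _) in history
        if v != PENDING and ts > first_bottom_end
    )


def strip_bottoms(history):
    """Property 3: the history H' that must be linearizable is obtained
    by dropping every operation that returned the failure response."""
    return [entry for entry in history if entry[2] != BOTTOM]
```

A history in which q receives a correct response strictly after p's failure response has completed violates property 2; if the two operations overlap, property 2 places no constraint on q's response.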


3.1.2 R-omission 

Suppose O is a wait-free object implemented from some “hardware components”. We 
informally argue that O may exhibit a more severe failure than R-crash, even if one of its 
“hardware components”, say f, fails by R-crash. If a process p executes an operation op 
on O that accesses f, f returns ⊥ to p, causing p to return ⊥ for op. Suppose a different 
process q later executes some operation op' on O and op' does not require q to access f. 
Process q does not “notice” the failure of f, and thus completes op' returning a non-⊥ 
response. This violates the “once ⊥, ever after ⊥” property of R-crash. 

Suppose that after p gets ⊥ it does not access O again. To q, this scenario is 
indistinguishable from one in which p had crashed just before accessing f. Since the implementation 
of O from its components is wait-free, it is designed to tolerate p’s apparent crash, and the 
non-⊥ response to q must be correct. 

In view of these considerations 5, we formalize the R-omission model of failure as follows. 
An object O of type T subject to R-omission failures satisfies the following properties. 

1. O is wait-free. 

2. Let E_t be any execution of (p_1, p_2, ..., p_N; O) up to time t with the following property: 
If a process p_i gets a response ⊥ from O for some invocation in E_t, then p_i does not 
invoke any operation on O subsequently in E_t. Defining H(E_t) as before, obtain H'(E_t) 
by replacing every tuple of the form (p, op, ⊥, t_s, t_e) by (p, op, *, t_s, ∞). Then 
H'(E_t) is linearizable with respect to T. 

5 A formal justification for the R-omission model is given in Section 8. 


3.1.3 R-arbitrary 

An object subject to R-arbitrary failures is free to return arbitrary responses to operation 
invocations. The only property we require from such an object is that it be wait-free. 

3.2 Non-responsive models of failure 

Each responsive model of failure has its non-responsive counterpart. The difference lies in 
the fact that an object subject to a non-responsive failure model may also fail to respond 
to operation invocations. 


3.2.1 Crash 

Crash is the most benign of all non-responsive models of failure. Informally, an object 
subject to a crash failure behaves correctly until it fails, and once it fails, it never responds 
to any operation invocations. More precisely, an object O of type T subject to a crash 
failure satisfies the following properties. 

1. If in a (temporally) infinite execution of the concurrent system (p_1, p_2, ..., p_N; O), O 
never responds to an invocation of some process p_i, then the total number of responses 
from O in that (temporally) infinite execution is finite. 

2. If E_t is any execution of the concurrent system (p_1, p_2, ..., p_N; O) up to time t, and 
H(E_t) is the corresponding history, then H(E_t) is linearizable with respect to T. 


3.2.2 Omission 

Omission failures are more severe than crash. An object subject to omission failures satisfies 
only property 2 of the crash model. 

3.2.3 Arbitrary 

An object subject to arbitrary failures is not required to satisfy any properties at all. Thus 
the behavior of such an object is completely unrestricted. In particular, the object may 
choose not to respond to an invocation. Even if it does, the response can be arbitrary. 
4 Definition of fault-tolerant implementations 


Let T be an object type and let L = (T_1, T_2, ..., T_n) be a list of object types (the T_i's are not 
necessarily distinct). A function I : T_1 × T_2 × ... × T_n → T is an implementation of T 
from L if O = I(o_1, o_2, ..., o_n) is a wait-free object of type T whenever o_i (1 ≤ i ≤ n) is 
a wait-free object of type T_i. We call O a derived object (of I) and the o_i's the base objects of 
O. The resource complexity of I is n, the number of base objects that make up a derived 
object of the implementation. I is t-tolerant for a failure model M if O behaves correctly 6 
even if a maximum of t base objects of O fail according to M. Note that, in general, if 
more than t base objects fail according to M, O may experience a more severe failure than 
M. We say that I is gracefully degrading if: when base objects only fail according to M, 
O is only subject to failures of type M. 7 

The implementation I is a self-implementation if T_1 = T_2 = ... = T_n = T. In other 
words, in a self-implementation the base objects are required to be of the same type as the 
derived object. 


5 Some basic results 

Gracefully degrading self-implementations have the desirable property that they can be 
composed recursively to realize any extent of fault-tolerance. This is formalized in the 
following lemma. 

Lemma 5.1 (Booster Lemma) If a type T has a t-tolerant gracefully degrading self-implementation 
I of resource complexity n for a failure model M, then T has a (t^2 + 2t)-tolerant gracefully 
degrading self-implementation of resource complexity n^2 for M. 

Proof (sketch) Let I = λ(o_1, o_2, ..., o_n) F(o_1, o_2, ..., o_n). Define 

I' = λ(o_1, o_2, ..., o_{n^2}) F(F(o_1, ..., o_n), F(o_{n+1}, ..., o_{2n}), ..., F(o_{(n−1)n+1}, ..., o_{n^2})). 

It is easy to verify that I' is a gracefully degrading (t^2 + 2t)-tolerant self-implementation of T 
for M. □ 
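
The counting behind the tolerance bound can be checked by brute force on small instances. The sketch below is an illustration under a simplifying assumption: a group of n base objects is treated as "failed" exactly when more than t of its members fail, and the two-level composition is treated as "broken" exactly when more than t groups have failed. The function names are hypothetical. Under this model the smallest breaking set has (t + 1)^2 elements, so (t + 1)^2 − 1 = t^2 + 2t failures are always tolerated.

```python
from itertools import combinations


def group_failed(failed, group, t):
    """A t-tolerant inner implementation can misbehave only if more than
    t of its own base objects have failed."""
    return sum(1 for o in group if o in failed) > t


def composed_broken(failed, n, t):
    """The outer implementation sees n inner groups of n base objects each;
    it can misbehave only if more than t of those groups have failed."""
    groups = [range(g * n, (g + 1) * n) for g in range(n)]
    return sum(group_failed(failed, g, t) for g in groups) > t


def max_tolerated(n, t):
    """Largest f such that no set of f failed base objects breaks the
    composition (exhaustive search, so only usable for small n)."""
    for size in range(n * n + 1):
        if any(
            composed_broken(set(c), n, t)
            for c in combinations(range(n * n), size)
        ):
            return size - 1
    return n * n
```

For example, `max_tolerated(3, 1)` evaluates the case n = 3, t = 1 over all subsets of the 9 base objects.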

Recursive application of the booster lemma gives the following corollary. 

Corollary 5.1 If a type T has a 1-tolerant gracefully degrading self-implementation of 
resource complexity k for a failure model M, then T has a t-tolerant gracefully degrading 
self-implementation of resource complexity O(t^{log k}) for M. 
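
As a sanity check on the exponent, the recursion behind the corollary can be unrolled by hand. The sketch below assumes the Booster Lemma is applied j times starting from a 1-tolerant implementation of resource complexity k; the symbols t_j and n_j are introduced here for illustration only, and logarithms are taken base 2.

```latex
\begin{align*}
t_0 &= 1, & n_0 &= k, \\
t_{j+1} + 1 &= (t_j + 1)^2, & n_{j+1} &= n_j^2, \\
t_j + 1 &= 2^{2^j}, & n_j &= k^{2^j}.
\end{align*}
% When t + 1 = 2^{2^j}, we have 2^j = \log_2 (t + 1), and therefore
\[
n_j \;=\; k^{\log_2 (t+1)} \;=\; (t+1)^{\log_2 k}.
\]
```

This agrees with the stated bound when t + 1 is exactly of the form 2^{2^j}; how intermediate values of t are handled is not spelled out in the text above.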

In Section 6.1.4, we illustrate how this corollary can be applied to construct a t-tolerant 
self-implementation of consensus for R-arbitrary failures. 

Our next result states that arbitrary failures have a responsive (R-arbitrary) and a 
non-responsive (omission) component. Thus the problem of tolerating arbitrary failures can be 
reduced to two strictly simpler problems: tolerating R-arbitrary failures and tolerating 
omission failures. 

6 That is, O remains wait-free and linearizable with respect to T. 

7 Even if all the base objects of O fail! 

Lemma 5.2 (Decomposability of arbitrary failures) A type T has a t-tolerant self-implementation 
for arbitrary failures if and only if T has a t-tolerant self-implementation I' for R-arbitrary 
failures and I'' for omission failures. 

Proof (sketch) The “only if” direction is obvious. To prove the “if” direction, suppose 
there are I' = λ(o_1, o_2, ..., o_m) F_a(o_1, o_2, ..., o_m) and I'' = λ(o_1, o_2, ..., o_n) F_o(o_1, o_2, ..., o_n). 
Define I = λ(o_1, o_2, ..., o_{nm}) F_o(F_a(o_1, ..., o_m), ..., F_a(o_{(n−1)m+1}, ..., o_{nm})). It can be 
verified that I is a t-tolerant self-implementation of T for arbitrary failures. □ 


6 Tolerating responsive failures 

To study whether an arbitrary object type has a t-tolerant implementation, we focus on 
two particular object types: consensus and register. Herlihy [Her91] and Plotkin [Plo89] 
showed that one can implement a wait-free object of any type using only consensus and 
register objects. Thus, if consensus and register have t-tolerant implementations, then 
every object type has a t-tolerant implementation. 


6.1 Fault-tolerant implementation of consensus 

In the following, we first define the object type N-consensus. We then present a t-tolerant 
self-implementation of N-consensus that works for both R-crash and R-omission failures. 
This implementation requires t + 1 base N-consensus objects, and is resource optimal. 
Following that, we show how to translate R-arbitrary failures of N-consensus objects to 
R-omission failures. Our translation is also proved to be resource optimal. Although the 
above two results can be chained together to obtain a t-tolerant self-implementation of 
N-consensus for R-arbitrary failures, the resultant self-implementation is not resource 
efficient: it requires O(t^2) base consensus objects. We therefore present an alternative efficient 
self-implementation of resource complexity O(t log t). 

6.1.1 N-consensus object type 

The consensus problem for a system of N processes is defined as follows. Each process p_i 
is given a binary input v_i initially. The consensus problem requires each correct process 
to eventually reach the same (irrevocable) decision value d such that d ∈ {v_1, v_2, ..., v_N}. 

The object type N-consensus is defined so that an object of this type makes the consensus 
problem solvable in a system of N processes. 

N-consensus is an N-process type that supports two operations, propose 0 and propose 
1, and has the following sequential specification. If the first operation invoked is propose 
v, then every invocation (including the first) is returned the response v. Together with 
linearizability, this sequential specification implies that an object O is of type N-consensus 
iff it satisfies the following three properties: 

• Validity: O returns a response v ∈ {0, 1} to an invocation (from process p) only if 
there is a prior invocation of propose v on O (by some process, possibly p itself). 

• Agreement: If O returns v_1, v_2 to two invocations, and v_1, v_2 ∈ {0, 1}, then v_1 = v_2. 

• Integrity: The response returned to an invocation by O is either 0 or 1. 

Let loc := Propose(p, v, O) denote that process p invokes propose v on O and stores the 
response returned in its local variable loc. 

6.1.2 Tolerating R-crash and R-omission failures 

We present a t-tolerant self-implementation of N-consensus for R-omission failures. Since 
R-omission failures are strictly more severe than R-crash, the same implementation also 
works for R-crash failures. 

A consensus object satisfies weak integrity if every response returned by the object is 
in {0, 1, ⊥}. 

Proposition 6.1 Any N-consensus object that fails by R-omission satisfies validity, 
agreement, and weak integrity. Conversely, if a failed N-consensus object satisfies validity, 
agreement, and weak integrity, then the failure is R-omission. 

Proof Follows from the definitions. □ 


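O_1, O_2, ..., O_{t+1} : N-consensus objects 

Procedure Propose(p, v_p, O)    /* v_p ∈ {0, 1} */ 
    estimate_p, w, k : integer local to p 
begin 
    estimate_p := v_p 
    for k := 1 to t + 1 do 
        /* invoke propose on the k-th base object with the current estimate */ 
        w := Propose(p, estimate_p, O_k) 
        /* adopt any non-⊥ response; a ⊥ response leaves the estimate unchanged */ 
        if w ≠ ⊥ then estimate_p := w 
    return (estimate_p) 
end 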


Figure 1: t-tolerant self-implementation of N-consensus for R-omission 
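
A minimal executable sketch of Figure 1 is given below for experimentation. Several modeling choices are assumptions made for illustration rather than part of the paper: the class `BaseConsensus`, its `omit_for` parameter (the set of processes that always receive the failure response), and the use of Python's `None` as the failure response `BOTTOM` are all hypothetical. Invocations that receive the failure response are modeled as taking no effect, and processes are run one after another, so only sequential schedules are exercised.

```python
BOTTOM = None  # stand-in for the distinguished failure response


class BaseConsensus:
    """A base consensus object that may fail by R-omission: it returns
    BOTTOM to the processes listed in `omit_for` and behaves correctly
    for everyone else."""

    def __init__(self, omit_for=()):
        self.decided = None
        self.omit_for = set(omit_for)

    def propose(self, p, v):
        if p in self.omit_for:
            return BOTTOM  # modeled as an invocation that takes no effect
        if self.decided is None:
            self.decided = v  # the first effective proposal wins
        return self.decided


def propose(p, v_p, bases):
    """Figure 1: visit the t + 1 base objects in order, adopting every
    non-BOTTOM response as the new estimate."""
    estimate = v_p
    for obj in bases:
        w = obj.propose(p, estimate)
        if w is not BOTTOM:
            estimate = w
    return estimate
```

With t = 2 and base objects `[BaseConsensus(omit_for={"p"}), BaseConsensus(omit_for={"p", "q"}), BaseConsensus()]`, two of the three base objects are faulty, yet p proposing 0 followed by q proposing 1 both obtain the value decided by the third, correct object.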
Theorem 6.1 Figure 1 gives a t-tolerant self-implementation of N-consensus for R-omission 
failures. The resource complexity of the implementation is t + 1 and is optimal. 

Proof (Sketch) 

Assume that at most t base objects fail by R-omission. We show below that the derived 
object O is a correct N-consensus object. 

1. O satisfies validity: Using Proposition 6.1, and the fact that p does not change 
estimate_p if a base object returns ⊥, it is easy to verify by an induction on k that 
if estimate_p equals some value u at any point, then there is a prior invocation (from 
some process q) of Propose(q, u, O). 

2. O satisfies agreement: Since at most t base objects fail, there is an O_k (1 ≤ k ≤ t + 1) 
that does not fail. So O_k returns the same response w ∈ {0, 1} to every process that 
accesses it. This implies that for all p that access O_k, estimate_p = w when p completes 
the k-th iteration of the loop, and due to Proposition 6.1, it never changes thereafter. 
Thus O returns the same response w to every p. 

It is obvious that O always returns 0 or 1, and that O is wait-free. 

Any t-tolerant self-implementation for R-omission failures must handle the case where 
t base objects fail (by R-crash) initially. It is therefore obvious that the resource complexity 
of t + 1 of our self-implementation is optimal. □ 

The above (self) implementation is not gracefully degrading. For instance, suppose that 
v_p = 0 and v_q = 1, and the t + 1 base objects fail by R-crash initially. It is easy to see that 
O returns 0 to p and 1 to q. Thus O does not satisfy agreement, and by Proposition 6.1, 
the failure of O is more severe than R-omission. In fact, we will now show that 2t + 1 is 
both a lower and upper bound on the resource complexity of a t-tolerant gracefully degrading 
self-implementation of N-consensus for R-omission 8. The self-implementation that requires 
2t + 1 base objects is given in Figure 2. 
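
The counterexample in this paragraph is easy to reproduce. The sketch below restates the loop of Figure 1 and runs it against t + 1 base objects that have all failed by R-crash from the start. The class name `CrashedConsensus` and the use of `None` for the failure response are assumptions made for illustration.

```python
BOTTOM = None  # stand-in for the distinguished failure response


class CrashedConsensus:
    """A base object that failed by R-crash before any invocation:
    every invocation receives BOTTOM."""

    def propose(self, p, v):
        return BOTTOM


def propose(p, v_p, bases):
    """The loop of Figure 1: adopt every non-BOTTOM response."""
    estimate = v_p
    for obj in bases:
        w = obj.propose(p, estimate)
        if w is not BOTTOM:
            estimate = w
    return estimate


t = 2
bases = [CrashedConsensus() for _ in range(t + 1)]
# Each process only ever sees BOTTOM, so it returns its own input.
results = (propose("p", 0, bases), propose("q", 1, bases))
```

Here `results` holds two different non-failure responses, so the derived object violates agreement even though every base object failed only by R-crash.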

Claim 6.1 Let v be the value of estimate_p and V be the value of V_p at the end of k iterations 
(1 ≤ k ≤ 2t + 1) of the for-loop of Propose(p, v_p, O) in Figure 2. Then v ∈ {0, 1}, and 
V[1..k] contains only ⊥'s and v's. 

Proof By an easy induction on k. □ 


Theorem 6.2 Figure 2 gives a t-tolerant gracefully degrading self-implementation of N-consensus
for R-omission.

Proof Assume all failures of base objects are by R-omission. We first show that, even if
more than t base objects fail, O satisfies validity, agreement, and weak integrity:

⁸ As will be shown later in Theorem 8.2, there is no gracefully degrading implementation of
N-consensus for R-crash.

O_1, O_2, ..., O_{2t+1} : N-consensus objects

Procedure Propose(p, v_p, O)    /* v_p ∈ {0, 1} */
    V_p[1..2t+1], estimate_p, w, k : integer local to p
begin
1    estimate_p := v_p
2    for k := 1 to 2t + 1 do
3        w := propose(p, estimate_p, O_k)
4        V_p[k] := w
5        if (w ≠ ⊥) ∧ (w ≠ estimate_p) then
6            estimate_p := w
7            V_p[1..(k−1)] := (⊥, ⊥, ..., ⊥)
8    if V_p has more than t ⊥'s then
9        return(⊥)
10   else return(estimate_p)
end


Figure 2: t-tolerant gracefully degrading self-implementation of N-consensus for R-omission


1. O satisfies validity: Using Proposition 6.1, and the fact that a process p does not
change estimate_p if a base object returns ⊥, it is easy to verify by an induction on
k that if estimate_p equals some value u at any point, then there is a prior invocation
(from some process q) of Propose(q, u, O).

2. O satisfies agreement: Suppose, for a contradiction, there exist two processes p and
q such that Propose(p, v_p, O) returns 0 and Propose(q, v_q, O) returns 1. From Claim
6.1, and lines 8, 9 of the algorithm, it follows that V_p has at least t + 1 0's at the end
of the execution of Propose(p, v_p, O) and V_q has at least t + 1 1's at the end of the
execution of Propose(q, v_q, O). This is possible only if there is a k (1 ≤ k ≤ 2t + 1) such
that Propose(p, estimate_p, O_k) returned 0 and Propose(q, estimate_q, O_k) returned 1.
Thus O_k does not satisfy agreement. By Proposition 6.1, the failure of O_k is not
R-omission, a contradiction.

3. O satisfies weak integrity : Trivial to verify. 

4. O satisfies integrity if at most t base objects fail: Let O_{k_1}, O_{k_2}, ..., O_{k_l}
(k_1 < k_2 < ... < k_l) be all the correct base objects. Since at most t fail, we have l ≥ t + 1.
By the integrity and agreement properties of O_{k_1}, there is a v ∈ {0, 1} such that for
all p, Propose(p, estimate_p, O_{k_1}) returns v. Thus for all p, estimate_p = v at the end
of k_1 iterations of the for-loop in Propose(p, v_p, O). Using this and Proposition 6.1,
it is easy to verify that at the end of the execution of Propose(p, v_p, O), V_p[k_i] = v
and estimate_p = v for all p and for all 1 ≤ i ≤ l. This implies, by lines 8, 9 of the
algorithm, that Propose(p, v_p, O) returns v.

From 1, 2, and 4 above, we conclude that the self-implementation is t-tolerant for
R-omission. From 1, 2, and 3 above, together with Proposition 6.1, we conclude that the
self-implementation is gracefully degrading for R-omission. □
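
The behaviour established by Theorem 6.2 can be tried out with a small sequential
simulation. The following Python sketch mirrors the loop of Figure 2 under simplifying
assumptions: processes run one after another rather than concurrently, BaseConsensus is
a hypothetical model of a base object whose R-omission failure consists of answering ⊥
to a chosen set of processes without changing state, and propose_fig2 is an illustrative
name that does not appear in the paper.

```python
BOTTOM = None  # models the response ⊥


class BaseConsensus:
    """Sequential model of a base consensus object.

    omit_to lists the processes to which the object answers ⊥ (an assumed
    R-omission pattern); such invocations leave the state unchanged.
    """

    def __init__(self, omit_to=()):
        self.decided = None
        self.omit_to = set(omit_to)

    def propose(self, p, v):
        if p in self.omit_to:
            return BOTTOM
        if self.decided is None:
            self.decided = v          # first proposal wins
        return self.decided


def propose_fig2(p, v_p, objects, t):
    """Lines 1-10 of Figure 2 for 2t+1 base objects."""
    assert len(objects) == 2 * t + 1
    estimate = v_p
    V = [BOTTOM] * len(objects)
    for k, obj in enumerate(objects):
        w = obj.propose(p, estimate)
        V[k] = w
        if w is not BOTTOM and w != estimate:
            estimate = w
            V[:k] = [BOTTOM] * k      # forget responses that supported the old estimate
    if V.count(BOTTOM) > t:
        return BOTTOM
    return estimate


t = 1
# One faulty object (at most t): both processes still agree.
objs = [BaseConsensus(omit_to={"q"}), BaseConsensus(), BaseConsensus()]
ok_p = propose_fig2("p", 0, objs, t)
ok_q = propose_fig2("q", 1, objs, t)

# Two faulty objects (more than t): q gets ⊥ instead of a conflicting value.
objs = [BaseConsensus(omit_to={"q"}), BaseConsensus(omit_to={"q"}), BaseConsensus()]
deg_p = propose_fig2("p", 0, objs, t)
deg_q = propose_fig2("q", 1, objs, t)
```

In the second run the derived object answers ⊥ to q, which is itself an R-omission
failure; this is the graceful degradation that the (t + 1)-object construction lacks.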

Theorem 6.3 The resource complexity of any t-tolerant gracefully degrading implementation
of N-consensus (N ≥ 2) for R-omission is at least 2t + 1.

Proof For a contradiction, assume that there is a t-tolerant gracefully degrading
implementation I from L = {T_1, T_2, ..., T_n} of N-consensus for R-omission, where n ≤ 2t. Let
O = I(O_1, O_2, ..., O_n). Consider the following interleaving of processes p and q.

Scenario 


1. Process p invokes Propose(p, 0, O) and executes the steps of Propose(p, 0, O) until
either it accesses exactly t base objects or it completes the execution of Propose(p, 0, O),
whichever is earlier. Let S_p denote the set of base objects accessed by p. Every base
object o ∈ S_p behaves correctly to p's invocations. Note that |S_p| ≤ t.

2. Process q invokes and completes the execution of Propose(q, 1, O). Let S_q denote the
set of base objects accessed by q, and T_q = S_q − S_p. The base objects behave as
follows: Every base object o ∈ S_p accessed by q returns ⊥ to q and undergoes no
change in its state; every base object o ∈ T_q behaves correctly to q's invocations. So
q sees at most |S_p| ≤ t failures of base objects.

3. Process p resumes execution (thus |S_p| = t), and completes any remaining steps of
Propose(p, 0, O). The base objects behave as follows: Every o ∈ T_q accessed by p
returns ⊥ to p; every o ∈ S_p − T_q accessed by p behaves correctly to p's invocations.
Note that T_q = S_q − S_p ⊆ {O_1, O_2, ..., O_n} − S_p, and thus |T_q| ≤ n − t ≤ t. So p sees
at most |T_q| ≤ t failures of base objects.


In a scenario such as the above, we assume that all steps in item k strictly precede 
every step in item k + 1. 

We make the following conclusions from the above scenario. 

1. From the characterisation of how the failed base objects behave, it is clear that all 
failures are by R-omission. Since I is gracefully degrading, the failure of O is no more 
severe than R-omission. Thus, by Proposition 6.1, O satisfies validity, agreement, and 
weak integrity. 

2. In the scenario described, neither process “knows” that the other process is also
running. Thus, by validity and weak integrity, Propose(p, 0, O) must return either 0
or ⊥, and Propose(q, 1, O) must return either 1 or ⊥.

3. In the scenario described, neither process sees more than t base object failures. Since I
is t-tolerant, it follows that neither Propose(p, 0, O) nor Propose(q, 1, O) may return
⊥. Together with Conclusion 2, this implies that Propose(p, 0, O) returns 0 and
Propose(q, 1, O) returns 1. Thus object O violates agreement (required by Conclusion
1). We conclude that I is not a gracefully degrading t-tolerant implementation.


□ 


6.1.3 Translation from R-arbitrary to R-omission

A t-tolerant translation from a failure model M to a (less severe) failure model M′ for object
type T is a self-implementation I : T × T × ... × T → T such that O = I(o_1, o_2, ..., o_n)
fails according to M′ if a maximum of t base objects of O fail according to M (and the
remaining base objects are correct). Note that if no base objects fail, by definition of an
implementation, O does not fail either.

In this section, we present a t-tolerant translation from R-arbitrary to R-omission for
N-consensus. It is easy to see that this translation can be used along with the t-tolerant
self-implementation for R-omission to obtain a t-tolerant self-implementation of N-consensus
for R-arbitrary failures. This is the principal motivation for studying such a translation.
We will also show that the resource complexity, 3t + 1, of our translation is optimal.

Since a consensus object that suffers an R-arbitrary failure may return a non-binary
response, we find it convenient to define f-propose(p, v, O) as in Figure 3.


Procedure f-propose(p, v, O)
begin
    loc := propose(p, v, O)
    if loc ∈ {0, 1} then
        return(loc)
    else return(0)
end


Figure 3: Filtering an arbitrary response to a binary response


Let O be the derived object of the translation in Figure 4. The base objects of O are
A[1..2t+1], B[1..t]. In the following claims, assume that at most t base objects suffer
R-arbitrary failures, and the remaining are correct.

Claim 6.2 O satisfies weak integrity. Further, if no base object fails, O satisfies integrity.

Claim 6.3 O satisfies validity.

A[1..2t+1], B[1..t] : wait-free N-consensus objects

Procedure Propose(p, v_p, O)
    count_p[0..1], w, i, belief_p : integer local to p
begin
1    Phase 1: count_p[0..1] := (0, 0)
2             for i := 1 to 2t + 1 do
3                 w := f-propose(p, v_p, A[i])
4                 count_p[w] := count_p[w] + 1
5    Phase 2: Choose belief_p such that
                  count_p[belief_p] > count_p[1 − belief_p]
6             for i := 1 to t do
7                 if belief_p ≠ f-propose(p, belief_p, B[i])
8                     return(⊥)
9                     exit Propose
10            return(belief_p)
end


Figure 4: t-tolerant translation from R-arbitrary to R-omission for N-consensus


Proof Suppose O returns v ∈ {0, 1} to the invocation Propose(p, v_p, O) (from process
p). Then v = belief_p (by line 10), and count_p[v] = count_p[belief_p] ≥ t + 1 (by line 5).
Since at most t base objects fail, there is at least one correct base object A[i] such that
propose(p, v_p, A[i]) returned v. By validity of A[i], it follows that some process q invoked
propose(q, v_q, A[i]) where v_q = v. This implies that q invoked Propose(q, v, O). □

Claim 6.4 O satisfies agreement. 

Proof Suppose O fails to satisfy agreement by returning v_1 ∈ {0, 1} to some process p, and
v_2 ∈ {0, 1} to a different process q where v_1 ≠ v_2. O returns v_1 to p implies v_1 = belief_p.
Similarly v_2 = belief_q. Since v_1 ≠ v_2, we have belief_p ≠ belief_q. It is easy to verify that
if all of A[1..2t+1] are correct, then belief_p = belief_q. It follows that at least one of
A[1..2t+1] fails.

Further, O returns v_1 to p implies that for all 1 ≤ i ≤ t, propose(p, belief_p, B[i]) returns
belief_p = v_1 to p. Similarly, for all 1 ≤ i ≤ t, propose(q, belief_q, B[i]) returns belief_q = v_2
to q. Thus all t base objects B[1..t] fail by not satisfying agreement. Thus, counting the
failed A[i]'s and B[i]'s, we have more than t failed base objects, a contradiction. □
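
The translation of Figure 4, together with the filter of Figure 3, can be exercised with a
short sequential Python sketch. It rests on assumptions that are not part of the paper:
processes run one after another, CorrectConsensus and ArbitraryConsensus are
hypothetical models of correct and R-arbitrarily faulty base objects, and a faulty object
is driven by a caller-supplied response function.

```python
BOTTOM = None  # models the response ⊥


class CorrectConsensus:
    """Correct base consensus object: the first proposal is the decision."""

    def __init__(self):
        self.decided = None

    def propose(self, p, v):
        if self.decided is None:
            self.decided = v
        return self.decided


class ArbitraryConsensus:
    """Assumed model of an R-arbitrary failure: a caller-supplied function
    chooses every response."""

    def __init__(self, respond):
        self.respond = respond

    def propose(self, p, v):
        return self.respond(p, v)


def f_propose(p, v, obj):
    """Figure 3: map any non-binary response to 0."""
    loc = obj.propose(p, v)
    return loc if loc in (0, 1) else 0


def propose_fig4(p, v_p, A, B, t):
    """Figure 4 with base objects A[1..2t+1] and B[1..t]."""
    assert len(A) == 2 * t + 1 and len(B) == t
    count = [0, 0]
    for a in A:                                   # Phase 1
        count[f_propose(p, v_p, a)] += 1
    belief = 0 if count[0] > count[1] else 1      # strict majority, 2t+1 is odd
    for b in B:                                   # Phase 2
        if belief != f_propose(p, belief, b):
            return BOTTOM
    return belief


t = 1
# A[0] echoes each caller's own proposal, which breaks agreement on that object.
A = [ArbitraryConsensus(lambda p, v: v), CorrectConsensus(), CorrectConsensus()]
B = [CorrectConsensus()]
res_p = propose_fig4("p", 0, A, B, t)
res_q = propose_fig4("q", 1, A, B, t)

# A lying B object can only push the derived object to ⊥.
A2 = [CorrectConsensus() for _ in range(3)]
B2 = [ArbitraryConsensus(lambda p, v: 1 - v)]
res_r = propose_fig4("r", 0, A2, B2, t)
```

In both runs a single R-arbitrary base failure shows up at the derived object as either a
correct answer or ⊥, which is the R-omission behaviour the translation is meant to provide.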

Together with Proposition 6.1, the above claims trivially imply the following theorem.

Theorem 6.4 Figure 4 presents a t-tolerant translation from R-arbitrary failures to
R-omission failures for N-consensus. The resource complexity of the translation is 3t + 1.


Theorem 6.5 The resource complexity of any translation I from R-arbitrary to R-omission 
for N-consensus is at least 3t + 1. 

Proof For a contradiction, assume the resource complexity of I is n ≤ 3t. We prove
the theorem through a series of claims, involving “indistinguishable” scenarios. Let O =
I(o_1, o_2, ..., o_n). In the following we say a process p touches a base object o_i if during the
execution of Propose(p, v_p, O), p executes propose(p, *, o_i).

Claim 6.5 Suppose p executes Propose(p, 0, O) to completion. If all base objects are
correct, then p touches at least t + 1 base objects.

Proof Suppose the claim is false, and p touches only o_{i_1}, o_{i_2}, ..., o_{i_m} (m ≤ t) before exiting
Propose(p, 0, O). Since all base objects are correct, O satisfies validity and integrity. Hence
Propose(p, 0, O) returns 0. Now consider the following two scenarios.

Scenario S1

1. p executes Propose(p, 0, O) to completion touching only o_{i_1}, o_{i_2}, ..., o_{i_m} (m ≤ t).
Propose(p, 0, O) returns 0.

2. q executes Propose(q, 1, O) to completion.

Scenario S2

1. o_{i_1}, o_{i_2}, ..., o_{i_m} fail and behave as though they are touched by p exactly as in scenario
S1. This is possible since m ≤ t.

2. q executes Propose(q, 1, O) to completion.

Since no base objects fail in S1, O behaves correctly. In particular, O satisfies integrity and
agreement. Thus Propose(q, 1, O) returns 0 in S1. Clearly S1 ≈_q S2 (we write S1 ≈_q S2
to denote that scenarios S1 and S2 are indistinguishable to process q). So Propose(q, 1, O)
returns 0 in S2 also, violating validity. By Proposition 6.1, this failure of O in S2 is not
R-omission. Since fewer than t + 1 base objects fail in S2, the translation I is incorrect, a
contradiction. □

Claim 6.6 Consider
Scenario S3

1. p executes Propose(p, 0, O) up to the point where it has touched exactly t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. q executes Propose(q, 1, O) to completion.

Then Propose(q, 1, O) returns 1.

Proof Let S = {base objects touched by q} − {o_{i_1}, o_{i_2}, ..., o_{i_t}}. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be
all the base objects in S arranged so that the first invocation of q on o_{j_l} is before the first
invocation of q on o_{j_{l+1}}. Note that k ≤ n − t ≤ 2t.

Let S2′ represent scenario S2 when m = t. Since fewer than t + 1 base objects fail in
S2′, the failure of O cannot be more severe than R-omission. Hence, by Proposition 6.1,
O satisfies validity and weak integrity in S2′. So Propose(q, 1, O) returns 1 or ⊥ in S2′.
Since S2′ ≈_q S3, we conclude Propose(q, 1, O) returns 1 or ⊥ in S3. Further, since no base
object fails in S3, O satisfies integrity in S3. So Propose(q, 1, O) returns either 0 or 1 in
S3. Together the above two conclusions imply the claim. □

Claim 6.7 Consider
Scenario S4

1. p executes Propose(p, 0, O) up to the point where it has touched exactly t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be as defined above (note k ≤ 2t). q executes Propose(q, 1, O) up
to the point where it has touched exactly {o_{j_1}, o_{j_2}, ..., o_{j_{k−t}}}.

3. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Consider
Scenario S5

1. p executes Propose(p, 0, O) up to the point where it has touched exactly t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. The base objects o_{j_1}, o_{j_2}, ..., o_{j_{k−t}} fail and behave as though they are touched by q
exactly as in S4.

3. p completes the execution of Propose(p, 0, O).

Since k ≤ 2t, the number of failed base objects in S5 is k − t ≤ t, and therefore (by
Proposition 6.1) O satisfies validity and weak integrity. So Propose(p, 0, O) returns either 0
or ⊥ in S5. Since clearly S4 ≈_p S5, Propose(p, 0, O) returns either 0 or ⊥ in S4 also. However,
since no base object fails in S4, O must satisfy integrity in S4. Thus Propose(p, 0, O) returns
0 in S4. □

Claim 6.8 Consider
Scenario S6

1. p executes Propose(p, 0, O) up to the point where it has touched exactly t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. q executes Propose(q, 1, O) to completion, returning 1, by Claim 6.6.

3. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be as defined above (note k ≤ 2t). {o_{j_{k−t+1}}, o_{j_{k−t+2}}, ..., o_{j_k}} fail
and behave as though they are never touched by q.

4. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Since S5 ≈_p S6, Propose(p, 0, O) returns 0 in S6. □

From the above claim, it is clear that O does not satisfy agreement in S6. Hence, by
Proposition 6.1, the failure of O in S6 is more severe than R-omission. Since fewer than
t + 1 base objects fail in S6, the translation I is incorrect, a contradiction. This completes
the proof of Theorem 6.5. □

6.1.4 Tolerating R-arbitrary failures 

Since N-consensus has a t-tolerant self-implementation for R-omission failures, and has a
t-tolerant translation from R-arbitrary to R-omission failures, it follows that N-consensus
has a t-tolerant self-implementation for R-arbitrary failures also. However, the resulting
self-implementation is expensive, requiring (3t + 1)(t + 1) base objects. Our main goal in this
section is to present a t-tolerant self-implementation for R-arbitrary failures whose resource
complexity is only O(t log t). This implementation employs the divide-and-conquer strategy.

In the following, we first present the base step: obtaining a 1-tolerant self-implementation
(Figure 5). This requires 6 base consensus objects, while the above-mentioned approach
through translation requires 8 base consensus objects. Then we show the recursive step of
obtaining a t-tolerant self-implementation from a t/2-tolerant self-implementation (Figure 6).

Claim 6.9 If at most one of O_i, O_{i+1}, and O_{i+2} (i = 1 or 4) fails, then an execution e of
Access(p, O_i, O_{i+1}, O_{i+2}, v) (see Figure 5) returns 1 − v only if there is some other execution
e′ of Access(q, O_i, O_{i+1}, O_{i+2}, 1 − v) (for some q) that either precedes or is concurrent
with e.

Claim 6.10 If none of O_i, O_{i+1}, and O_{i+2} (i = 1 or 4) fails, then, for all p and q,
Access(p, O_i, O_{i+1}, O_{i+2}, v_p) returns the same value as Access(q, O_i, O_{i+1}, O_{i+2}, v_q).

Theorem 6.6 Figure 5 gives a 1-tolerant gracefully degrading self-implementation of N-consensus
for R-arbitrary failures.





O_i : N-consensus objects (1 ≤ i ≤ 6)

Procedure Access(p, O_1, O_2, O_3, v)
    count_p[0..1], w : integer local to p
begin
    count_p[0..1] := (0, 0)
    for i := 1 to 3 do
        w := f-propose(p, v, O_i)
        count_p[w] := count_p[w] + 1
    if count_p[0] > count_p[1] then
        return(0)
    else return(1)
end

Procedure Propose(p, v, O)
begin
    v := Access(p, O_1, O_2, O_3, v)
    v := Access(p, O_4, O_5, O_6, v)
    return(v)
end


Figure 5: 1-tolerant self-implementation of N-consensus for R-arbitrary failures


Proof Suppose that at most one of O_i (1 ≤ i ≤ 6) fails. Then either none of O_1, O_2, and
O_3 fails or none of O_4, O_5, and O_6 fails. Validity of O follows from Claim 6.9. If none of
O_4, O_5, and O_6 fails, agreement of O follows from Claim 6.10. If none of O_1, O_2, and O_3
fails, agreement of O follows from Claims 6.9 and 6.10. It is obvious that O always returns
0 or 1, is wait-free, and is gracefully degrading. □
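
The two-stage voting of Figure 5 can be checked on small cases with a sequential Python
sketch. It assumes, beyond what the paper states, that the two processes run one after
another and that the single faulty object always contradicts its caller; the class and
function names are hypothetical.

```python
class CorrectConsensus:
    """Correct base consensus object: the first proposal is the decision."""

    def __init__(self):
        self.decided = None

    def propose(self, p, v):
        if self.decided is None:
            self.decided = v
        return self.decided


class ArbitraryConsensus:
    """Assumed model of an R-arbitrary failure driven by a response function."""

    def __init__(self, respond):
        self.respond = respond

    def propose(self, p, v):
        return self.respond(p, v)


def f_propose(p, v, obj):
    """Figure 3: map any non-binary response to 0."""
    loc = obj.propose(p, v)
    return loc if loc in (0, 1) else 0


def access(p, objs, v):
    """Procedure Access of Figure 5: majority vote over three base objects."""
    count = [0, 0]
    for o in objs:
        count[f_propose(p, v, o)] += 1
    return 0 if count[0] > count[1] else 1


def propose_fig5(p, v, O):
    """Procedure Propose of Figure 5 over six base objects."""
    assert len(O) == 6
    v = access(p, O[0:3], v)
    v = access(p, O[3:6], v)
    return v


def run(faulty_index):
    """p proposes 0, then q proposes 1, with exactly one faulty base object."""
    O = [CorrectConsensus() for _ in range(6)]
    O[faulty_index] = ArbitraryConsensus(lambda p, v: 1 - v)
    return propose_fig5("p", 0, O), propose_fig5("q", 1, O)


results = [run(i) for i in range(6)]
```

Whichever object is faulty, the two correct objects of its group outvote it, so both
processes return the value proposed first.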

Given the 1-tolerant gracefully degrading self-implementation in Figure 5, by applying
the Booster lemma (Lemma 5.1) we can obtain a t-tolerant self-implementation
of N-consensus for R-arbitrary failures. However, the resulting resource complexity is
O(t^{log_2 6}), which is even higher than the complexity of the implementation through
translation mentioned above. We therefore present below an alternative efficient recursive
strategy. See Figure 6.

Theorem 6.7 Figure 6 gives a t-tolerant (gracefully degrading) self-implementation of N-consensus
for R-arbitrary failures of resource complexity O(t log t).

Proof We prove the theorem through a series of claims. In all of them we assume that at
most t base objects fail.

A_0[1..3t+1], A_1[1..3t+1], B[1..4t+1] : (0-tolerant) N-consensus objects
O_1 : ⌈t/2⌉-tolerant N-consensus object
O_2 : ⌊t/2⌋-tolerant N-consensus object

Procedure Propose(p, v_p, O)
    count_p[0..1], WitnessCount_p[0..1], belief_p, ans1_p, ans2_p, v′_p, i, w : integer local to p
begin
1    count_p[0..1], WitnessCount_p[0..1] := (0, 0, 0, 0)
2    Phase 1: for i := 1 to 3t + 1 do
3        w := f-propose(p, v_p, A_{v_p}[i])
4        if w = v_p then count_p[v_p] := count_p[v_p] + 1
5    Phase 2: ans1_p := f-propose(p, v_p, O_1)
6    Phase 3: for i := 1 to 4t + 1 do
7        w := f-propose(p, ans1_p, B[i])
8        WitnessCount_p[w] := WitnessCount_p[w] + 1
9    Phase 4: for i := 1 to 3t + 1 do
10       w := f-propose(p, v_p, A_{1−v_p}[i])
11       if w = 1 − v_p then count_p[1 − v_p] := count_p[1 − v_p] + 1
12   Phase 5: Choose belief_p such that WitnessCount_p[belief_p] > WitnessCount_p[1 − belief_p].
13   if WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 then
14       return(belief_p); exit Propose
15   if WitnessCount_p[belief_p] ≥ 2t + 1 and count_p[belief_p] ≥ t + 1 then
16       v′_p := belief_p
17   else v′_p := v_p
18   ans2_p := propose(p, v′_p, O_2)
19   return(ans2_p)
end

Figure 6: Efficient t-tolerant self-implementation of N-consensus for R-arbitrary failures

Claim 6.11 If O_1 fails, then O_2 does not fail.

Proof Since O_1 and O_2 are derived objects of ⌈t/2⌉-tolerant and ⌊t/2⌋-tolerant
self-implementations of N-consensus respectively, O_1 and O_2 tolerate up to ⌈t/2⌉ and ⌊t/2⌋
failed base objects respectively. Since at most t base objects fail, both O_1 and O_2 cannot
fail. □

Claim 6.12 If O_1 does not fail, then O satisfies validity and agreement.

Proof Suppose O_1 does not fail. Since a correct O_1 satisfies agreement, we have ans1_p =
ans1_q = v for all p, q. Thus every process proposes the same value v to every B[i] in Phase 3.
Since at most t objects in B[1..4t+1] lie (fail), belief_p = v and WitnessCount_p[belief_p] ≥
3t + 1 (for every p).

Also, by the validity of O_1, some process q will have invoked propose(q, v, O_1) before
any process gets the response v from O_1. This implies that q will have finished Phase
1 before any process begins Phase 3. Since at most t objects in A_v[1..3t+1] may lie,
it follows that for all p, count_p[v] ≥ 2t + 1 by the end of Phase 4 of p. Thus we have
WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 (for every p). Hence every p
decides v (the proposal of q) by line 14. □

Claim 6.13 If O_1 fails, O satisfies validity and agreement.

Proof Suppose O_1 fails. Then by Claim 6.11, O_2 does not fail. We need to consider two
cases.

CASE 1 Suppose some process p returns by line 14. This implies that WitnessCount_p[belief_p]
≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1. Since at most t base objects may fail, it follows that
WitnessCount_q[belief_p] ≥ 2t + 1 and count_q[belief_p] ≥ t + 1 (for every q). This implies, by
line 12, belief_q = belief_p; let val = belief_p. Since WitnessCount_q[belief_q] ≥ 2t + 1 and
count_q[belief_q] ≥ t + 1 (for every q), either q returns belief_q = val by line 14 and we have
agreement between p and q, or q sets v′_q to belief_q = val by line 16. Thus every q that does
not return by line 14 proposes v′_q = val on O_2. Since O_2 does not fail, by validity of O_2,
ans2_q = v′_q = val, and q returns ans2_q = val by line 19. Again we have agreement between
p and q.

To see that O satisfies validity, note that count_p[belief_p] ≥ 2t + 1 implies that some
process proposed belief_p = val on at least t + 1 objects in A_{belief_p}[1..3t+1].

CASE 2 Suppose no process returns by line 14. Then every q returns ans2_q by line
19. Since O_2 does not fail, we have (for all p, q) ans2_p = ans2_q = val. Thus O satisfies
agreement.

By the validity of O_2, some process p must have proposed val to O_2. That is, v′_p = val.
In the algorithm, v′_p equals either v_p or belief_p. If v′_p = v_p, then clearly O satisfies validity. If
v′_p = belief_p ≠ v_p, then p must have executed line 16. It follows that count_p[belief_p] ≥ t + 1.
This implies, considering that at most t objects in A_{belief_p}[1..3t+1] fail, that some process
q proposed v_q = belief_p on some object in A_{belief_p}[1..3t+1]. Thus val = v′_p = belief_p = v_q
and v_q is the initial proposal of q. Thus O satisfies validity. □

Claim 6.14 The resource complexity of the implementation in Figure 6 is O(t log t).

Proof Denoting the resource complexity of the t-tolerant (gracefully degrading)
self-implementation of N-consensus for R-arbitrary failures by f(t), we have the following
recurrence: f(t) = 2f(t/2) + 2(3t + 1) + (4t + 1) and f(1) = 6. Hence the result. □
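
To make the growth rate concrete, the recurrence can be evaluated directly. The Python
sketch below assumes t is a power of two, so that t/2 is always an integer; the closed form
10 t log_2 t + 9t − 3 is obtained here by unrolling the recurrence and is not stated in the
paper. The function names are illustrative, and the last column is the (3t + 1)(t + 1) cost
of the translation-based construction mentioned earlier.

```python
def f(t):
    """Recurrence of Claim 6.14, evaluated for t a power of two."""
    if t == 1:
        return 6
    return 2 * f(t // 2) + 2 * (3 * t + 1) + (4 * t + 1)


def closed_form(t):
    """10 t log2(t) + 9t - 3, obtained by unrolling the recurrence."""
    k = t.bit_length() - 1          # log2(t) for a power of two
    return 10 * t * k + 9 * t - 3


def via_translation(t):
    """(3t + 1)(t + 1): translation composed with the R-omission construction."""
    return (3 * t + 1) * (t + 1)


table = [(t, f(t), closed_form(t), via_translation(t)) for t in (1, 2, 4, 8, 16, 32)]
for row in table:
    print(row)
```

Taking the recurrence at face value, the recursive construction needs fewer objects at
t = 1 (6 versus 8) and again from t = 16 onward among powers of two, while for t = 2, 4,
and 8 the product (3t + 1)(t + 1) is the smaller of the two.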

To complete the proof of Theorem 6.7, note that agreement and validity follow from
Claims 6.12 and 6.13. It is obvious that the implementation is wait-free, gracefully
degrading, and that O satisfies integrity. □

6.2 Fault-tolerant implementation of register

The register type supports two operations, read and write v. The sequential specification 
is simple: read returns the most recent value written. Lamport defined a weaker (non- 
linearizable) object known as safe register [Lam86]. In the following, we first show how to 
build a fault-tolerant safe register from safe registers, some of which may suffer R-arbitrary 
failures. We then resort to the register construction results in the literature to show that 
register has a self-implementation for R-arbitrary failures. 

Lemma 6.1 Using 2t + 1 1-reader, 1-writer safe registers, at most t of which may suffer
R-arbitrary failures, we can implement a failure-free 1-reader, 1-writer safe register.

Proof (sketch) To read the safe register, the reader reads all base registers, and returns 
the majority response. If there is no majority, it returns an arbitrary value. To write a 
value v into the register, the writer writes v to all base registers. It is easy to verify that 
the above strategy implements a safe register that behaves correctly even if a maximum of 
t base registers suffer R-arbitrary failures. □ 
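
The read and write procedures of this proof can be sketched in Python as follows. The
sketch is sequential (no read overlaps a write, which is the only case in which a safe
register constrains the value read), and it models an R-arbitrary failure, as a simplifying
assumption, by a base register that reports a fixed wrong value on every read. The class
name MajoritySafeRegister and its parameters are hypothetical.

```python
from collections import Counter


class MajoritySafeRegister:
    """Safe register built from 2t+1 base registers, at most t of them faulty.

    `faulty` maps a base-register index to the value it reports on every read
    (an assumed, deliberately simple model of an R-arbitrary failure).
    """

    def __init__(self, t, initial=0, faulty=None):
        self.n = 2 * t + 1
        self.cells = [initial] * self.n
        self.faulty = dict(faulty or {})

    def write(self, v):
        for i in range(self.n):       # the writer writes v to all base registers
            self.cells[i] = v

    def read(self):
        responses = [self.faulty.get(i, self.cells[i]) for i in range(self.n)]
        value, votes = Counter(responses).most_common(1)[0]
        if votes > self.n // 2:       # strict majority of the 2t+1 responses
            return value
        return responses[0]           # no majority: any value may be returned


reg = MajoritySafeRegister(t=2, faulty={0: 99, 3: -1})
reg.write(7)
value_read = reg.read()
```

With t = 2, the three correct base registers outvote the two faulty ones, so the read
returns the value written.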

It is possible to implement a multi-reader, multi-writer, atomic register using 1-reader, 
1-writer, safe registers [Blo87, BP87, CW90, HV91, Lam86, NW87, Pet83, PB87, Sch88, 
SAG87, Vid88, Vid89, VA86]. Thus we have the following theorem. 

Theorem 6.8 register has a t-tolerant self-implementation for R-arbitrary failures . 


6.3 Universality results 

We now describe how to implement fault-tolerant wait-free shared objects of a generic type.
An object type T is finite if A(T) has only a finite number of states. Also let N-consensus
with reset be an N-process object type informally defined as follows: An object O of this
type behaves exactly like an object of type N-consensus with the difference that O supports
an extra operation reset. Applying “reset” to O will initialize O and make it available for
a fresh round of consensus. The operation “reset” is required to work only in the absence
of concurrent operations⁹.

Herlihy showed that every finite object type¹⁰ has an implementation from (N-consensus
with reset, unbounded register) ([Her91]). The use of unbounded registers was
replaced by boolean registers by Plotkin ([Plo89]). Using Plotkin's result, together with
Theorems 6.7 and 6.8, we obtain the following corollary.

Corollary 6.1

• Every finite object type has a t-tolerant implementation from (N-consensus with
reset, boolean register) for R-arbitrary failures.

• If a finite object type T implements N-consensus with reset and boolean register
then T has a t-tolerant self-implementation for R-arbitrary failures.


Herlihy's construction can be easily modified to yield a universal implementation from
(N-consensus with reset, unbounded register) even for infinite object types. Thus
Corollary 6.1 holds even if T is an infinite object type, provided that boolean register is
replaced by unbounded register in the statement of the corollary.

Herlihy showed that queue, stack, test&set, fetch&add etc. implement 2-consensus,
and compare&swap implements N-consensus [Her91]. It is easy to show that test&set and
compare&swap implement boolean register, and queue, stack, and fetch&add
implement unbounded register. Thus,

Corollary 6.2 The following object types have t-tolerant self-implementations for R-arbitrary
failures: (2-process) queue, stack, test&set, fetch&add, and (N-process) compare&swap.

7 Tolerating non-responsive failures 

Unlike responsive failures, non-responsive failures are almost always impossible to cope
with. We first show the impossibility of implementing a consensus object from any finite
list of base objects, one of which may crash. We do so by a reduction from the consensus
problem among a finite number of processes, one of which may crash. The latter problem
is known to be unsolvable [FLP85, LAA87].

Theorem 7.1 There is no 1-tolerant implementation of 2-consensus for crash failures.

⁹ Therefore N-consensus with reset cannot be defined modularly through sequential specification and
linearizability.

¹⁰ An object type T is finite if A(T), the automaton giving the sequential specification of T, has only a
finite number of states.

Proof Suppose the theorem is false and there is a finite list L = {T_1, T_2, ..., T_l} of object
types such that there is a 1-tolerant implementation I of 2-consensus from L for crash
failures.

Now consider the following concurrent system S in which all objects are registers.
Processes in S are {p_1, p_2} ∪ {q_j | 1 ≤ j ≤ l}, and the registers are {decision} ∪ {invocation(i, j),
response(j, i) | 1 ≤ i ≤ 2, 1 ≤ j ≤ l}. We claim that the consensus problem is solvable in S
even if at most one process in S may crash. The following is the protocol. Let v_i ∈ {0, 1}
be the input of p_i. The idea is that process q_j (1 ≤ j ≤ l) simulates an object O_j of type
T_j, and process p_i (i = 1, 2) simulates the execution of propose(v_i) on the derived object
I(O_1, O_2, ..., O_l). The details are as follows.

Initialize all registers to ⊥. The process p_i simulates the execution of the procedure
propose(v_i) of the implementation I as explained below. If propose(v_i) requires
p_i to invoke some operation op on O_j, p_i appends op to the contents of invocation(i, j). If
propose(v_i) requires p_i to check if a response to some outstanding invocation on O_j has
arrived, p_i checks if a response has been appended (by q_j) to response(j, i). If propose(v_i)
requires p_i to decide some value v, p_i first writes v in the decision register, then decides it, and
halts execution. Also p_i periodically checks if the register decision contains a v ∈ {0, 1}. If
so, it decides v and halts execution.

Process q_j simulates the base object O_j as follows. q_j checks the registers invocation(1, j)
and invocation(2, j) in a round-robin fashion. When it notices that some operation op has
been appended to invocation(i, j), it applies op to the local copy of O_j that it maintains
and appends the corresponding response to response(j, i). Also q_j periodically checks if the
register decision contains a v ∈ {0, 1}. If so, it decides v and halts execution.

It is easy to verify that the above protocol solves the consensus problem among the
l + 2 processes in S even if at most one of them crashes. To see this, consider the following
cases:

1. No process crashes: Since every q_j, the process simulating object O_j, is correct and
propose(v_i) executed by p_i (i = 1, 2) is a wait-free procedure, it follows that one of
p_1 and p_2 or both eventually write a value v ∈ {0, 1} into decision. Thus every correct
process eventually decides v.

2. p_1 crashes: By our assumption that at most one process crashes, process p_2 and q_j
(1 ≤ j ≤ l), the process simulating object O_j, are all correct. Together with the fact
that propose(v_2) is a wait-free procedure, this implies that p_2 eventually writes a
decision value v into decision and decides v. Every other correct process eventually
observes v in decision and decides v.

3. p_2 crashes: By a symmetric argument.

4. q_k crashes (for some 1 ≤ k ≤ l): This corresponds to the crash of the simulated base
object O_k. Since I is 1-tolerant, the execution of propose(v_i) by process p_i (i = 1, 2)
eventually terminates. Thus one of p_1 and p_2 or both write a value v into decision.
Thus every correct process eventually decides v.

In all the above cases, since I is an implementation of 2-consensus the following holds:
if both p_1 and p_2 write into the decision register, then they both write the same value, and
this value is either v_1 or v_2.

We showed that we can use I to solve the consensus problem in system S, and this
contradicts the impossibility result of Loui and Abu-Amara [LAA87]. □

We can strengthen the above result as follows. Suppose that at most one base object
may fail, and it can only do so by being “unfair” (i.e., by not responding) to at most one
process. Furthermore, suppose the identity of this process is a priori “common knowledge”
among all the processes. Even with this extremely weak model of object failure, called
1-unfairness to a known process, we can prove the following:

Theorem 7.2 There is no 1-tolerant implementation of 2-consensus for 1-unfairness to
a known process.

Proof (Sketch) Assume the theorem is false, namely, there is a 1-tolerant implementation
I of 2-consensus for 1-unfairness to process p_1. Now proceed as in the proof of Theorem 7.1.
Cases 1, 2, and 3 still hold. Consider Case 4, where q_k crashes (for some 1 ≤ k ≤ l). This
corresponds to the crash of the simulated base object O_k. This object is now potentially
unfair to both p_1 and p_2. But I tolerates unfairness to only p_1. We circumvent this difficulty
by modifying p_2's protocol as follows. If propose(v_2) requires p_2 to invoke some operation
op on some O_j, p_2 appends op to the contents of invocation(2, j), as before, but now it also
waits until a corresponding response is appended to response(j, 2) (by process q_j).¹¹ Thus,
if p_2 attempts to access O_k after the crash of q_k, it will simply wait for the response forever.
Therefore, at worst, the crash of q_k looks like O_k is unfair to p_1, and p_2 is extremely slow.
Since I tolerates the unfairness of one base object to p_1, I(O_1, ..., O_l) continues to behave as
a wait-free consensus object. Hence the procedure propose(v_1) executed by p_1 eventually
terminates, returning the decision value. As before, this value is written into decision, and
eventually every correct process decides. Again, we have a contradiction to the impossibility
result in [LAA87]. □

Let C be the class of all object types that can implement 2-consensus. From the above 
two theorems we have 

Corollary 7.1 For all T ∈ C, there is no 1-tolerant implementation of T for crash or
1-unfairness to a known process.

From [Her91] and this corollary, we conclude that Queue, Stack, Test&Set, Fetch&Add,
Compare&Swap, and several other common types do not have a 1-tolerant implementation
for crash or 1-unfairness to a known process. In contrast to the above impossibility results,
we show:

Theorem 7.3 register has a t-tolerant self-implementation for arbitrary failures.

11 It is easy to see that with this modification Cases 1, 2, and 3 still hold.

This follows from


Lemma 7.1 Using 5t + 1 1-reader, 1-writer safe registers, at most t of which may suffer
arbitrary failures, we can implement a failure-free 1-reader, 1-writer safe register.

Proof (Sketch) Informally, the reader invokes ‘read’ on all registers (on which it has no
pending invocation) and waits until 4t + 1 respond. It then returns the majority value. If
there is no majority, it returns an arbitrary value. The writer writes to all registers (on
which it has no pending write). It waits until 4t + 1 of them return an “operation completed”
response. It is easy to verify that the above strategy implements a safe register that works
correctly even if a maximum of t base registers suffer arbitrary failures. □
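The strategy in the proof sketch can be illustrated with a small sequential simulation. This is only a sketch under simplifying assumptions: there is no real concurrency, pending invocations are not modelled, and the class names (BaseRegister, LyingRegister, SilentRegister, TolerantRegister) and the NO_RESPONSE sentinel are hypothetical names introduced here, not taken from the paper.

```python
from collections import Counter

NO_RESPONSE = object()  # hypothetical sentinel: the base register never answered


class BaseRegister:
    """A correct base register (sequential model only)."""

    def __init__(self, initial=None):
        self.value = initial

    def write(self, v):
        self.value = v
        return "ok"

    def read(self):
        return self.value


class LyingRegister(BaseRegister):
    """Assumed faulty behaviour: acknowledges writes but always reads back a lie."""

    def __init__(self, lie):
        super().__init__()
        self.lie = lie

    def write(self, v):
        return "ok"

    def read(self):
        return self.lie


class SilentRegister(BaseRegister):
    """Assumed faulty behaviour: never responds to any invocation."""

    def write(self, v):
        return NO_RESPONSE

    def read(self):
        return NO_RESPONSE


class TolerantRegister:
    """Derived register built from 5t + 1 base registers, at most t faulty."""

    def __init__(self, base, t):
        if len(base) != 5 * t + 1:
            raise ValueError("need exactly 5t + 1 base registers")
        self.base = base
        self.quorum = 4 * t + 1

    def write(self, v):
        # Write everywhere; the real writer would wait for 4t + 1 acknowledgements.
        acks = sum(1 for r in self.base if r.write(v) is not NO_RESPONSE)
        if acks < self.quorum:
            raise RuntimeError("more than t base registers failed")

    def read(self):
        # Collect the first 4t + 1 responses and return the most common value.
        responses = []
        for r in self.base:
            x = r.read()
            if x is not NO_RESPONSE:
                responses.append(x)
            if len(responses) == self.quorum:
                break
        if len(responses) < self.quorum:
            raise RuntimeError("more than t base registers failed")
        # If no strict majority exists, any value may be returned (safe register).
        return Counter(responses).most_common(1)[0][0]
```

With t = 2, eleven base registers of which one lies and one stays silent still yield the last written value, because at least 2t + 1 of the 4t + 1 collected responses come from correct registers that hold it.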


8 Other basic results 

Consider a system that supports a given set H of primitive hardware objects. Assume that 
these objects may fail, but if they do, they are guaranteed to only fail by R-crash. Suppose 
we wish to build an object O using only objects in H, and O is only required to function 
correctly in the absence of failures. However, when objects in H fail by R-crash, we would 
like O to fail only by R-crash. This last requirement is desirable for two reasons: 

• R-crash, with its simple “once ⊥, everafter ⊥” property, is the most benign type of
failure.

• Such an object O appears like any other primitive hardware object of the system:
With O, the system would be no different, in functionality and failure semantics,
from one that supports H ∪ {O} as its primitive hardware objects.

In our terminology, a (0-tolerant) gracefully degrading implementation is exactly what 
we are looking for. The existence of such an implementation depends on the type of O and 
the types of the objects in H. Unfortunately, as we show below, most objects do not have 
such implementations even when H includes very powerful objects. 
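As an informal illustration of the “once ⊥, everafter ⊥” behaviour, the following sketch wraps a sequential object so that it answers correctly until it crashes and returns ⊥ to every invocation afterwards. This is one reading of the R-crash property as it is described in this section, not the paper's formal definition; the names RCrashObject, BOTTOM, and register_apply are assumptions made for the example.

```python
BOTTOM = "⊥"  # hypothetical stand-in for the distinguished failure response


class RCrashObject:
    """Wraps a sequential object; after crash(), every invocation returns BOTTOM."""

    def __init__(self, apply_op, state):
        self._apply = apply_op   # function (state, op) -> (new_state, response)
        self._state = state
        self._crashed = False

    def crash(self):
        self._crashed = True

    def invoke(self, op):
        if self._crashed:
            return BOTTOM        # once ⊥, everafter ⊥
        self._state, response = self._apply(self._state, op)
        return response


def register_apply(state, op):
    """Sequential specification of a read/write register (illustrative only)."""
    kind, arg = op
    if kind == "write":
        return arg, "ok"
    return state, state          # a read leaves the state unchanged


r = RCrashObject(register_apply, 0)
r.invoke(("write", 5))
before = r.invoke(("read", None))   # correct response before the crash
r.crash()
after = [r.invoke(("read", None)), r.invoke(("write", 9))]
```

A gracefully degrading implementation would have to make the derived object O look like such a wrapper whenever its base objects behave like one.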

An object type T is order-sensitive if it is a deterministic N-process type (N ≥ 2) and
the following holds: There exist state S in A(T), operations op, op' (not necessarily distinct)
in OP(T), and values u, v, u', v' such that each of (op, u),(op', u') and (op', v'),(op, v) is a
sequential execution from state S consistent with T, and u ≠ v and u' ≠ v'. Queue is an
example of an order-sensitive object type. To see this, instantiate S to the state in which
there are two elements 5 and 10 in the queue (5 in the front), and both op and op' to deq.
Now we have u = 5, u' = 10, v' = 5, and v = 10. Thus u ≠ v and u' ≠ v', as required.
Stack, Test&Set, Compare&Swap are some other examples of order-sensitive object types.
An object type is non order-sensitive if it is deterministic and not order-sensitive. Examples
of non order-sensitive types include register, sticky bit, move, and swap.
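The queue example can be checked mechanically. The sketch below computes u, u', v, v' for a given state and pair of deterministic operations, and tests whether they form a witness of order-sensitivity. The function names (witness, is_witness, deq, read, write) and the encoding of states are assumptions made for illustration. Note that failing to find a witness for one state and one pair of operations does not by itself show that a type is non order-sensitive; that requires checking all states and operation pairs.

```python
def deq(state):
    """Queue deq on an immutable tuple; returns (new_state, response)."""
    if not state:
        return state, None
    return state[1:], state[0]


def read(state):
    """Register read: state is the stored value."""
    return state, state


def write(value):
    """Build a register write operation for the given value."""
    def op(state):
        return value, "ok"
    return op


def witness(state, op, op2):
    """Return (u, u2, v, v2), where u2 and v2 play the roles of u' and v'.

    (op, u),(op2, u2) is the execution with op first;
    (op2, v2),(op, v) is the execution with op2 first.
    """
    s1, u = op(state)
    _, u2 = op2(s1)
    s2, v2 = op2(state)
    _, v = op(s2)
    return u, u2, v, v2


def is_witness(state, op, op2):
    """True if this state and operation pair witness order-sensitivity."""
    u, u2, v, v2 = witness(state, op, op2)
    return u != v and u2 != v2


queue_values = witness((5, 10), deq, deq)          # u, u', v, v' for the queue example
queue_ok = is_witness((5, 10), deq, deq)
register_ok = is_witness(0, write(1), read)        # write always returns "ok", so u = v
```

For the queue, both responses depend on the order of the two operations; for the register, the response to the write is the same in either order, so this particular choice is not a witness.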

Theorem 8.1 There is no (0-tolerant) gracefully degrading implementation of any order-
sensitive object type for R-crash from any list of non order-sensitive object types.

Proof Omitted. □


Preserving the failure semantics of the underlying system is a highly desirable property
of an implementation. For R-crash, the above theorem shows that this property is not 
achievable in many cases: implementations necessarily amplify the severity of the R-crash 
failures of the underlying system. For example, consider a system that supports registers 
and sticky bits in “hardware”. In such a system any object can be implemented [Plo89], 
including (for example) queues. Assume the given registers and sticky bits only fail by 
R-crash. Can we implement a queue that also fails by R-crash? The above theorem shows 
that this cannot be done! 

Requiring a derived object to inherit the R-crash semantics of its base objects is even
more difficult if we add the requirement that the derived object be 1-tolerant. Even if we do
not restrict the types of primitives available in the underlying system, such implementations 
do not exist for most objects of interest! This is shown by the theorem below. 

Theorem 8.2 There is no 1-tolerant gracefully degrading implementation of any order- 
sensitive object type for R-crash. 

Proof For a contradiction, assume L = {T_1, T_2, ..., T_n} is a list of types such that
there is a 1-tolerant gracefully degrading implementation I of T from L for R-crash. We
prove the theorem through a series of claims, involving “indistinguishable” scenarios. Let
O = I(O_1, O_2, ..., O_n), and op, op', S, u, v, u', v' be as given in the definition of order-
sensitive types.

Claim 8.1 Suppose O is in state S, and processes p and q execute Apply(p, op, O) and
Apply(q, op', O) respectively. For any interleaving of Apply(p, op, O) and Apply(q, op', O),
either Apply(p, op, O) returns u and Apply(q, op', O) returns u', or Apply(p, op, O) returns
v and Apply(q, op', O) returns v'.

Proof In the linearization of the execution history, either Apply(p, op, O) precedes
Apply(q, op', O) or Apply(q, op', O) precedes Apply(p, op, O). This, together with the
definitions of u, u', v, v', and the fact that T is a deterministic type, trivially implies the claim. □

Claim 8.2 There exists a sequence α of steps (of p) and a step s (of p) such that the
following Scenarios S1 and S2 are possible.

Scenario S1 (scenario starts with O in state S)

1. Process p initiates and partially executes Apply(p, op, O) by completing the steps in α.

2. Process q initiates and completes (all the steps of) Apply(q, op', O), returning v'.

3. p completes the remaining steps of Apply(p, op, O), returning v.

Scenario S2 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u'.

3. p completes the remaining steps of Apply(p, op, O), returning u.


Proof Clearly if process p executes no steps of Apply(p, op, O) before process q initiates
and completes Apply(q, op', O), then Apply(q, op', O) must return v'. Further, if p initiates
and completes all the steps of Apply(p, op, O) (let β be this sequence of steps) before q
initiates and completes Apply(q, op', O), then Apply(q, op', O) must return u'. Together
with Claim 8.1, by which Apply(q, op', O) must return either u' or v', the above implies that
there exists a sequence α of steps and a step s such that α.s is a prefix of β for which the
claim holds. □

Hereafter we will assume O_k is the base object accessed by p in step s.

Claim 8.3 Consider

Scenario S3 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u' (as in S2).

3. O_1, O_2, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Suppose Apply(p, op, O) returns ⊥. Since I is gracefully degrading, the failure
of O must appear like R-crash. This requires, given that Apply(q, op', O) returns a non-⊥
response, that Apply(q, op', O) precede Apply(p, op, O) in the linearization order. Doing
so, however, implies that (op', u') is a sequential execution from S consistent with T. This
cannot be true since u' ≠ v', T is deterministic, and (op', v') is a sequential execution from
S consistent with T. Thus Apply(p, op, O) cannot return ⊥.

Suppose Apply(p, op, O) returns w where ⊥ ≠ w ≠ u. Since in the linearization,
either Apply(p, op, O) precedes Apply(q, op', O) or Apply(q, op', O) precedes Apply(p, op, O),
it follows that either (op, w),(op', u') or (op', u'),(op, w) is a sequential execution from S
consistent with T. This cannot be true since T is deterministic and (op, u),(op', u') and
(op', v'),(op, v) are sequential executions from S consistent with T and w ≠ u, u' ≠ v'.

We conclude that Apply(p, op, O) must return u. □

Claim 8.4 Consider

Scenario S4 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u and Apply(q, op', O) returns u'.

Proof Clearly S4 ≈_p S3. Therefore, as in S3, Apply(p, op, O) returns u in S4. Since I is 1-
tolerant, and since only O_k has failed by the completion of Apply(q, op', O), Apply(q, op', O)
must return a non-⊥ response. From the definitions of u, u', v, v', it is easy to verify that
the only non-⊥ response that satisfies linearizability is u'. □

Claim 8.5 Consider

Scenario S5 (scenario starts with O in state S)

1. p initiates and partially executes Apply(p, op, O) by completing the steps in α.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Clearly S5 ≈_q S4. Therefore Apply(q, op', O) returns u' as in S4. By similar argu-
ments as in Claim 8.3, it can be shown that Apply(p, op, O) returns u. □

Claim 8.6 Consider

Scenario S6 (scenario starts with O in state S)

1. p initiates and partially executes Apply(p, op, O) by completing the steps in α.

2. q initiates and completes (all the steps of) Apply(q, op', O).

3. All base objects O_1, O_2, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u, and Apply(q, op', O) returns v'.

Proof Since S6 ≈_p S5, Apply(p, op, O) returns u as in S5. Since S6 ≈_q S1, Apply(q, op', O)
returns v' as in S1. □

Neither (op, u),(op', v') nor (op', v'),(op, u) is a sequential execution from S consistent
with T. Hence the execution in Claim 8.6 is not linearizable. Thus the failure of O in S6 is
more severe than R-crash. We conclude that I is not a gracefully degrading implementation
for R-crash, a contradiction which concludes the proof of Theorem 8.2. □
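The final step can be made concrete for the queue instance used earlier (state with 5 at the front and 10 behind it, op = op' = deq, so u = 5 and v' = 5). The sketch below is a brute-force check, assuming only two completed operations with non-⊥ responses: it tries both sequential orders and reports whether either one reproduces the observed responses. The function name linearizable and the tuple encoding of the queue are hypothetical.

```python
def deq(state):
    """Queue deq on an immutable tuple; returns (new_state, response)."""
    if not state:
        return state, None
    return state[1:], state[0]


def linearizable(state, op_p, resp_p, op_q, resp_q):
    """True if some sequential order of the two operations, starting from
    `state`, yields exactly the observed responses for p and q."""
    # Order 1: p's operation first, then q's.
    s, r_p = op_p(state)
    _, r_q = op_q(s)
    if (r_p, r_q) == (resp_p, resp_q):
        return True
    # Order 2: q's operation first, then p's.
    s, r_q = op_q(state)
    _, r_p = op_p(s)
    return (r_p, r_q) == (resp_p, resp_q)


S = (5, 10)
outcome_u_uprime = linearizable(S, deq, 5, deq, 10)   # p gets u, q gets u'
outcome_v_vprime = linearizable(S, deq, 10, deq, 5)   # p gets v, q gets v'
outcome_s6 = linearizable(S, deq, 5, deq, 5)          # p gets u, q gets v' (Scenario S6)
```

The two outcomes allowed by Claim 8.1 pass the check, while the S6 outcome, in which both processes dequeue 5, matches neither order.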

The above discussion raises some questions on the “practicality” of the R-crash model: 
Even if “hardware” objects fail by R-crash, “software” objects don’t. The R-omission model 
defined in this paper does not have this serious limitation. In fact, for any t > 0 every 
object type has a t-tolerant gracefully degrading implementation from (universal type, 
register) for R-omission. In other words, implementations preserving the R-omission 
semantics of the underlying system always exist. This is a formal justification for adopting 
the R-omission model of failure.